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Date- June 17, 1974 
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MEMORANDUM FOR FILE 

LIL (a Little Implementation Language) is designed to help you write machine level code for 
the PDP-11. It looks like a high level language, because it is one (very much like C in fact); but it 
deals directly with registers, indexing, and all the other things you must keep in mind while writ¬ 
ing assembly code. The major advantages it offers over assembly language are that you are en¬ 
couraged to do some things in ways that have proved to be more reliable, and that the final pro¬ 
gram is very readable. 

1. Expressions 

We begin with simple expressions, which form the meat of a program. The PDP-11 code to 
form the sum of x, y, and z in register 0, and the sum of x and y in w is: 
mov x,r0 
add y,rO 
mov rO,w 
add z,rO 

In LIL you could write 
r0 * x; 
rO + y; 
w = rO; 
rO + z; 

but you are also allowed to say 

rO - x)¥ y -> w + z; 

or even 

w = (rfl = x + y); rO + z; 
to get the same effect. 

The idea is that expressions are evaluated left to right Everytime you see the sequence 

operand operator operand 

the appropriate PDP-11 instruction is generated and operator operand is discarded, leaving just the 
left operand for further use. All operators are of equal binding strength in LIL — assignments 
(moves), adds and multiplies are produced in the order you specify. Only the use of parentheses 
will modify this strict left-to-right evaluation, as in the last example. There the rule is — evalu¬ 
ate any expressions inside parentheses, first for the left operand, then for the right, and take as 
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the operand the leftmost thing inside the parentheses. This rule can be applied to any depth of 
nesting. 

There are a number of different operators defined in LIL, enough to let you specify any data 
manipulation instruction in the PDP-11. All of the operators are tabulated toward the end of this 
memo. Meanwhile, we will just explain any new ones as we happen across them. 

2. Conditions 

To get a little fancier, let us compute the maximum of x and y in rO: 
mov x,iO 
cmp rO,y 
bge If 
mov y,rO 

1: 

This is written: 

if (rO = x < y) rO - y; 

The relational operator < specifies what condition you wish to test for. The compiler will gen¬ 
erate a cmp or tst instruction, if necessary, and figure out the best way of branching to achieve 
the desired flow of control. If you want to compare 16-bit unsigned addresses for ‘lower’ or 
‘higher,’ use << or >>, respectively, instead of < or >. Equality testing is specified by ==, 
and inequality by 

If you have reason to believe that the condition code is already properly set, you can just 
write >=», for example, to specify the condition ‘greater than or equal to.’ There are also key¬ 
words for each of the four condition code bits — minus, zero, oflow, and carry. In conjunction 
with the not operator * it is possible to specify any machine defined condition: 
rO = x + y; 
if ( < ) rO — rO; 
if Coflow) z = rO; 
else z = 32767; 

The =— operator puts the negative of rO in rO, i.e. neg rO. 

3. Boolean Connectives 

Tests are often much more complicated. An alphabetic symbol in LIL, for instance, is any 
upper or lower case letter, dot, or underscore. To set rO to 1 if it contains an alphabetic charac¬ 
ter, and 0 otherwise: 


cmp rO,'a 

/ is it between 'a' 

bit If 


cmp rO,'z 

/ and ’i 

ble 2f 


cmp rO,'A 

/ or between 'A' 

ble If 


cmp rO,'Z 

/ and 'Z' 

ble 2f 


cmp rO,'. 

/ or is it 

beq 2f 


cmp r0,'_ 

/ or 

bne If 


mov $1,H) 

/ 1 to i0 if so 

br 2f 



O' 
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p 


1: clr rO / 0 otherwise 

2: 

In LIL this becomes 

if ('a' <= rO && rO <= 'i 

II 'A' <= it) && rO <= 'Z' 

II rO == 

II rO == rO = 1; 

else rO = 0; 

The operators && and II let you combine conditionals in a fairly obvious fashion. If the left con¬ 
dition before a && is false, the combination must be false, so the right condition is skipped. 
Similarly, a true condition before II lets you skip evaluation of the following condition. If the 
right condition must be evaluated, then it determines the truth of the combination. && binds 
tighter than II, the way we are accustomed to reading logical statements. Again the tests are per¬ 
formed, as needed, strictly left to right. 

You can also use parentheses and the * operator to alter the normal binding order or invert 
the meaning of any part of a test. 

Remember that two equal signs in a row are used to specify that you are testing for equali¬ 
ty, because one equal sign alone always means ‘assign.’ Had the test ended as 
II rO - % WRONG!! 

it would have meant, ‘or, copy the character _ into iO and if rO is subsequently nonzero, the test 
is true.’ This clobbers rO and makes the test always come out true — so watch out. (That % 
marks a comment, by the way. Everything from the % through the end of the line is ignored by 
the compiler.) 

You might also observe that we wrote 
rO = 0; 

where we wanted a clr rO instruction. You can trust the compiler to nearly always generate the 
best instruction for the job, so there is no special notation for clr. Thus, should you change 
some parameter to zero and recompile, clrs will appear wherever possible. Similarly inc, dec, 
and quite a few other special purpose instructions are generated automatically. The tabulation of 
operators shows when they are used. 

4. Else 

As the last example showed, you can always follow an if by an else, which specifies what 
action to take if the test is false. The else part is skipped over when the test is true. 

An important use of the else clause is to string together a series of tests to make what is 
known as a ‘case switch’ in some languages. This is a multi-way if that does at most one of 
several different cases. LIL permits you to write certain character constants (between single 
quotes), for instance, using ‘escape’ sequences to make invisible characters apparent. \t becomes 
a tab character, \n a newline. The code to perform this conversion looks something like: 

if (rO == O') rO = 000; 
else if (iO == a ) rt) = 006; 
else if (iO == e') rO = 004; 
else if (rO — n ) rO - 012; 
else if (iO == p ) rO = 033; 
else if (iO =*= 'r') rO =* 015; 
else if (rO — t ) rO - Oil; 
else ; % default is \x = x; or iO * rO; 
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The first test that succeeds, reading down from the top, causes the corresponding action to 
be performed; execution then resumes following the last else. In this case, the last else has no 
test and so becomes the default action — what to do if none of the tests succeeds. Since the de¬ 
fault action here is the null statement the else could have been left off. It was written as a 
reminder, however, which is often a good idea. No harm is done because the compiler is smart 
enough not to generate extraneous code. 

Those numbers that begin with 0 in the example above are octal. 

Since the statement controlled by the if can be anything, including another if, a natural am¬ 
biguity arises when the else option is used. When you say 
if (a) if (b) 
else 

to which if does the else belong? The compiler arbitrarily resolves the matter by matching the 
else to the second if. This is not what the indentation implies! To get the flow of control 
correct, you would have to add a null else: 
if (a) if (b) 
else ; 

else 

There is no need for this rather ugly form, however, since you can more easily write 

if (a && b) 
else 

and get the desired effect. If you really want the other binding of the else (which is also an ugly 
way of coding), at least make the if-else into a group. 

5. Groups 

Just as you can group parts of an expression with parentheses, so too can you group state¬ 
ments together. Use braces: 

if (x < y) 

{rO = x; % exchange x and y 
x = y; 
y = rO; ) 

Now the if controls the entire group instead of just one of the statements. Groups may appear 
anywhere you see an expression ending in a semicolon (which is one of the simplest statements 
allowed), including inside other groups. Note that the closing brace occurs after the last sem¬ 
icolon (and no semicolon is written after the braces). Semicolons end things, they don’t separate 
them. 

6. Loops 

Loops are another important type of control structure. They often control a group of 
statements: 

rO — 0; 

while (ifi < nitems) 

{alrOl = b(r0| - clrOl + 1; 
rO + 2; } 

The square brackets indicate indexing, with respect to H). In this form of the loop statement, the 
test is made at the top. That way, if nitems happens to be zero, the body of the loop is safely 
skipped, which is usually what we want to happen. If the test is met, the statement following 
the test, which happens to be a group, is executed and control returns to the test. The test will 
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eventually fail, in this case, and execution will continue with the statement following the loop. 
Loops can also take another form: 
rO = 0; 

do { areallrOI = 0; 
area2(r0| = 0; 

} while (rO + 2 < 500); 

This clears 250 elements of the arrays areal and area2. Since we know that the loop will be 
obeyed at least once, we can put the test at the end and get a tiny reduction in the number of 
branches required. More important, we can state the increment and test a little more compactly. 

The do-while form can also be followed by a non-null statement (or group, of course) in¬ 
stead of just a semicolon: 

rO = listhead; 
rl = 0; 

do rl + 1; 

while (nextlrOI) 
rO = nextlrOI; 

% rO points at last element 

It is rare that you need a loop with the test somewhere in the middle — in fact you should make 
sure you really do before writing it that way — but the facility is there if you need it. 

7. Break, continue, and goto 

The break statment lets you exit from a loop at other places besides the test: 
rO - 0; 

do if (alrOl "= blrOI) break; 

while (rO + 2 < 500); 

As soon as corresponding elements of the arrays a and b are found that do not match, the loop 
is exited. A loop may contain any number of breaks. 

Sometimes you want to go back to the top of a loop from several places. Use continue: 
while (true) 

{-getcharO; 

if (rO == ' II rO == '\t' II rO — V) 
continue; 

The keyword true specifies a condition that is always met. It is used in this case to form an 
‘infinite’ loop with no tests. -getcharO is a subroutine call — more on subroutines later (and the 
reason for that silly _) — to a routine that returns an input character in rO. Thus this loop 
header skips all blanks, tabs, and newlines on input. 

You should be able to specify nearly all your control flow with if, else, do, while, break, 
and continue. If you make a habit of doing so, in fact, you will find your code much easier to 
read, and less likely to contain bugs. Occasionally, however, you may find a need for more direct 
control. The goto statement lets you specify an arbitrary br or jmp instruction: 
goto jail; 

goto brtablelrOI; % table of branches 

goto latablelrOll; % table of branch addresses 

As the last form shows, indirection is specified by square brackets in a way analogous to index¬ 
ing. 
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8. Addressing 

We have seen a number of different kinds of operands, implying different modes of ad¬ 
dressing. Now is a good time to complete the list. 

Every operand (symbol, literal, expression) in LIL has a number of attributes besides value. 
One of the most important is its type, which corresponds very closely to legitimate addressing 
modes on the PDP-11. A constant such as 3, for instance is assumed to be an immediate reference 
to a literal 3. If you have to refer to absolute memory location 3, write mem 3. Similarly, reg 3 is 
equivalent to the predefined symbol r3, or register 3. 

On the other hand, you sometimes want to get hold of the address of a memory reference, 
instead of its contents. &x has the same value as x, but is of type immediate: 
iO = &x; rl = 0; 
while (I 1 OI++) rl + 1; 

This counts the number of non-zero entries in the array x, up to but not including the first zero. 
The notation lifi|++ specifies autoincrement addressing; The ++ comes after the brackets to re¬ 
mind you that incrementing occurs after the reference is used. It is the only operator that fol¬ 
lows operands. 

To put r0-r5 on the stack: 

—(spl = rO =■ rl =* r2 =* r3 = r4 =* r5; 

Since the first operand remains unchanged, autodecrement occurs on every assignment. 

As we saw earlier, square brackets indicate indexing, as in xlrOl, indirection, fxl, or both, 
IxlrOll. Any valid PDP -11 addressing mode is acceptable, even KrOll (which becomes [OlrOll). And, 
of course, expressions are allowed inside brackets: 
alrt) - 1] = Wrl - j + 21; 


9. Compile-time expressions 

Most assemblers allow you to do a limited amount of arithmetic with symbols. To move 
the contents of location b+2 into iO, for instance, you just write 
mov b+2,r0 

LIL uses operators like + and —, however, to specify machine instructions. So a special notation 
is used to show which computations are to be done at compile-time: 
iO - "b + 2"; 

Expressions inside double quotes generate no machine instructions, but are computed 
directly from the values of symbols and literals. You can look on compile-time expressions as 
instructions to a special machine whose storage locations are the symbol table entries. Most of 
the run-time operators have analogous compile-time meanings. 

Compile-time expressions have a slightly different interpretation than the run-time expres¬ 
sions used to specify PDP-11 code. We want 
b + 2 —> iO; 

to add 2 to location b, then move the contents of b into rO. But 
"b + 2* -> rO; 

does not redefine the symbol b as having a value bigger by 2 than before — the expression sim¬ 
ply refers to the value of b and leaves the symbol alone, which is what we usually want. So this 
form, as before, moves the contents of location b+2 into iO. 


si 
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To define a symbol, i.e. actually set its value, use the compile-time assignment: 

"x - b + 2"; 

This defines x initially as having the value (and other attributes of) b, adds 2 to that, then closes 
x for any further redefinition, with the end of the compile-time expression. Hereafter we can 
write 

rO = x; 

to move the contents of b+2 into rO. 

Type and other attributes are also set in compile-time assignments: 

"isp = Ispl"; "sink = —isp"; 
sink = rO = rl = r2 = r3 = r4 = r5; 

As you can see, addressing modes can be built up in stages. 

10. Labeled Groups 

The usual way to define a symbol is with a labeled group: 
x{ } 

This notation causes several things to happen: 

1) The code between braces is a local region — symbols may be defined inside it with the 
same names as those in other regions, or even in containing regions, without conflict. Lo¬ 
cal definitions disappear after the close of the group. 

2) break and continue statements may be used to skip to the bottom or top, respectively, of 
the group, just as with a loop. If the statement has a label, as in break x, the break (or 
continue) ignores all containing loops and labeled groups whose labels do not match the la¬ 
bel in the statement, and instead applies to the innermost containing group whose label 
matches. 

3) The label x is defined (in the containing region, not just local to the group) as being a 
memory reference whose value is the location counter at the start of the group, and whose 
size is the number of bytes of code inside the group (more on both the location counter and 
the size attribute later). 

Labeled groups are used as the bodies of subroutines, or of important subunits of routines, 
or just to define data: 

arrayfl; 1; 2; 5; 8; 11; } 
scalar{—1; } 

label {; } % just define label 

Note that you can generate constants in line simply by writing them. (This is how you sneak in 
special instructions like wait and iot.) 

When used as a local region, it is best to head the group with a local statement: 
x{ "local a, b, c"; 

...;} 

This declares a, b, and c to be symbols that will be defined inside the group, and that should be 
forgotten on exit from the group. 

If not explicitly declared, a symbol is made local to the first group in which it is referenced. 
Should the group end before it is defined, the symbol is promoted into the next outer containing 
group. Any symbols undefined at the end of a program file are published as undefined externals. 
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11. Subroutines 

No special code is generated at either the start or the end of a labeled group, even one used 
as a subroutine body. Thus there are no preconceived notions about how you must build 
subroutines, which is by way of saying that you are obliged to write entry and exit sequences ex¬ 
plicitly. 

For convenience, however, there is a presumptuous mechanism for calling routines. 
Designed to be compatible with code produced by and for the language C, the two ways of calling 
a routine are: 

fun(); % no arguments 

fun2(argl, arg2, ifl = arg3, arg4 + 1); 

The code generated is 

jsr pc,fun / first call 

mov arg3,r0 / arg expressions 
inc arg4 / left—to—right 

mov arg4,—(sp) / args moved 

mov rO,—(sp) / onto stack 

mov arg2,—(sp) / right—to—left 
mov argl,—(sp) 
jsr, pc,fun2 / second call 

add $8,sp / args popped 

The arguments are moved onto the stack in reverse order so that 2[sp| always refers to the first 
argument, 4(spl the second, and so forth. That way, if the wrong number of arguments are 
passed, the first ones still match up properly and no major harm is done. 

To receive this sequence takes no special effort, although you might wish to use the C li¬ 
brary routine 

jsr r5; rsave; n; 

to save nonvolatile r2-r5, reserve n bytes on the stack, and leave r5 pointing at the argument list 
(so argl is at 4(r51, arg2 at 6|r51, etc.). You would then leave, after putting any integer result in 
rO, by 

goto rretrn; 

That jsr r5 is no mistake — LIL treats jsr as a unary operator, similar to - or ", which 
changes a register into an immediate that looks like the first word of a 
jsr r5,*$subr 

instruction. To return from a subroutine, write (you guessed it) 
rts pc; 

And, to complete the record, you can make system calls with a 
sys n; 

where n is an immediate. 

For added convenience, lil treats c compatible function calls as references to register 0: 
if (alflc)) x = c; 

becomes 

mov c,—(sp) 
jsr pc,alf 

tst (spH- % sneaky form of sp+2 
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tst rO 
be If 
mov c,x 

1 : 

And if you plan to use a large number of these function calls, you may wish to do as the c 
compiler does — reserve a temporary location of top of the stack, Ispl, to hold the last argument 
on all calls. Thus 

fun(c) 

becomes 

mov c,*sp 
jsr pc, fun 

in general saving an autodecrement and an add on each call with arguments. To turn on this 
feature, declare a local version of the flag symbol .temp and set it nonzero. 

12. More declarations 

If you want your function names, or any others, to be used to satisfy references in other 
compilations, use 

"extern a, b, c"; 

extern is an additive attribute — it can apply to symbols also declared local. In fact, if you wish 
to refer only in some sub-region to an undefined external, you must declare it in both local and 
extern statements inside that region. 

In case you have to use any of the lil keywords to communicate with, say, assembler 
routines, you can turn off their special meaning, once and for all, with an escape: 

"extern \if"; 

"xif - if"; 

This is best done, as shown, at the end of a program file, to define the alias used in the body of 
the code. 

A useful variant of the local statement looks like a compile-time function call: 

"0[sp|(a, b, c sizeof 4, d, e)"; 

The effect of this statement, for each argument from left to right, is to 

1) declare the argument to be local and define it to be equal to the label (originally Olspl). 

2) increment the value of the label by the size of the argument (e.g. Olspl becomes 2lspl). De¬ 
fault size for all symbols is 2, but the sizeof operator can be used to specify another size. 

If used as an operand, this form thus has a value equal to its original value plus the size of 
everything declared, in this case 12 (since the sizeof operator explicitly declared c to be of size 4). 

This form is most useful in defining names for automatic variables (reserved on the stack 
every time a routine is entered), in conjunction with the c entry sequence 
subr{ jsr rS; rsave; sizeof "0(sp|(a, b, c sizeof 4, d, e)"; 

As a unary operator, sizeof changes its operand into an immediate reference whose value is the 
size of the operand, so the proper number of bytes to be reserved will be generated in line. 
Another convenient notation resembles indexing: 
a(2| * rO; % is the same as 
"a + (2 * sizeof a)" = rO; 
where the * specifies compile-time multiplication. 
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13. Load control 

Sizes are not used by LIL to generate code — mostly the compiler keeps track of them for 
handy reference, as above. If a symbol remains undefined, however, its size is published along 
with its name as an undefined external. The UNIX loader treats such a reference with a non-zero 
size as a labeled common block. All references with the same name are collected by the loader 
(keeping track of the maximum size requested), so that sufficient space can be allocated in the bss 
section. 

Generally, you compile code into the text section, compile initialized data into the data sec¬ 
tion, and just layout data areas in the bss section ( bss is a fossil meaning ‘block started by sym¬ 
bol’ in middle low Phoenecian). UNIX promises to zero everything in the bss area, but won’t let 
you help. LIL therefore will smack your fingers if you try to generate code except into the text or 
data areas. 

How do you switch loading among areas? There is a special symbol called dot V which is 
set to the current value of the location counter at the beginning of every statement. You may 
redefine dot yourself, so long as you don’t try to back up over code already generated. It is gen¬ 
erally meaningful to set dot only to one of the other redefinable symbols: .abs, .text, .data, or 
.bss. These are the location counters for absolute (unrelocated) loading, or for one of the relocat¬ 
able sections already discussed. Thus: 

". = .data"{ % switch to data area 

array{l; 1; 2; 5; 8; 11; } 
scalar{—1; } 

label!; } % all defined in data area 

} % switch back 

Using dot to label a group as shown provides for a temporary diversion of loading, since the la¬ 
bel is defined at the close of the group as having the value of the location counter at the start of 
the group. If you want to generate a lot of code into, say, the data area, use the simple switch: 
".=.data"; 

It is not necessary to switch dot unless you plan to generate code (although dot can be 
made to point in many interesting directions). You can use the shorthand 
\bss(a, b, c sizeof 4, d, e)"; 

as before, only this time the location counter for the bss area is updated to make room for the 
variables. Code before and after this statement will go into the text section, as usual, unless oth¬ 
erwise specified. 

It is good practice to keep data that must be altered out of the text section, by the way. 
That way, you can often write-protect the code (hardware permitting) and make your debugging 
job much easier. 

14. Bytes and strings 

We have come a long way without mentioning byte addressing. There is nothing terribly 
magic about it — just use the byte modifier before any operand you wish to be used as a byte 
reference. 

r0 = byte s; 
byte d = byte s; 
byte d = '\0'; 

The rule is — if either operand is of type byte, the other must also be of type byte, unless it is 
an immediate or a register. 
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If you get tired of writing byte, make it part of the symbol definition: 

H .bss(byte name sizeof 8)"; 
rO = 4lr51; 
rl - 0; 

do namelrll = byte Ii0|++; 

while (rl + 1 < sizeof name); 
namelrll = '\0'; 

And if you must ‘undo’ the byte attribute, use word. 

Single quotes are used in LIL to denote character strings of any length. Odd length strings 
are padded to even length by adding a null byte, but otherwise no effort is made to ensure a null 
terminator. Strings up to length two fit in one word and may be used wherever an immediate is 
allowed. Longer strings may be used only to generate ascii text in line: 
table! 0; 'zero'; 

1; 'one '; 

2; two '; } 

or as a literal reference: 

if (debug) _printf(='error message\n'); 

The unary operator = compiles the operand string (or any immediate) into the data section. 
If the odd byte of the last word is not zero, a zero word is appended. The operand becomes an 
immediate whose value is the address of the first word allocated. Thus, the literal notation above 
is equivalent to the C string reference "error message\n". 

15. C compatibility 

Most of the job of interfacing to the c world is handled in the subroutine call/return 
machanism already described. You should be familiar with c and its call by value convention to 
intermix code safely. C insists on prepending a _ to all symbols referenced in that language. 
This avoids conflicts with assembly language support routines, but makes for generally unpretty 
names in LIL. 

If you run under the aegis of the C support library, you must provide a main routine, called 
_main, and should leave gracefully by returning from -main or by calling -exit. 

Be warned that 

_fun(byte a); 

will move only one byte onto the stack, leaving garbage in the high-order byte. C expects a prop¬ 
erly sign-extended integer. 

Also, thanks to the curious way that things are declared, you must be careful with 
undefined external function names. If the size of such a reference is non-zero (and the default is 
2), the UNIX loader will not match the reference up to a text symbol. LIL sets the size attribute to 
zero on any symbol used as the entry name on a c compatible call. But for nonstandard usage, 
you must say 

"extern rsave(), rretrnO"; 
to set entry name sizes to zero. 

Most of all, if you mix languages, watch out for the differences. 
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16. Preprocessor 

lil uses a preprocessor essentially identical to the one provided with c. If the first charac¬ 
ter of a program file is #, the file is scanned for lines that look like 
#include 'pathname' /* the specified file is included */ 

#define SYMBOL string /* all subsequent occurrences of 
SYMBOL are replaced by the 
defining string, surrounded by 
blanks */ 

pl/I style comments are used only in conjunction with preprocessor statements. 

Some useful things to define are: 

#define ENTER jsr r5; rsave; 

#define LEAVE goto rretrn 

#define RETURN rts pc 

#define AUTO Olspl 
#define ARCS 4lr5| 

If you have more than one case at your disposal, it is a good idea to write defined symbols in all 
caps to distinguish them from other symbols. 

17. Conditional compilation 

LIL also provides for conditional compilation, with the aid of compile-time conditions and the 
ordinary conditional statements of the language. Compile-time conditions must evaluate to 
strictly true or false, and may be combined with those keywords to determine whether a piece of 
code should be omitted: 

if ("recsize == 512") getblockO; 
else read(fcb, Abuffer, recsize); 

Depending on the value of the symbol recsize (which must be known), only one of the two 
function calls will be compiled into the final program. The else is, of course, optional: 

if (debug) 

(dcount + 1; 

_printf(—got this far %d\n', dcount); } 

This can be turned on or olf simply by writing 
"debug = true"; 


or 

"debug — false"; 

or it can be made a runtime switch by writing 
".bss(debug)"; 

and setting it on the basis of some input character at run-time. 

Loops can also be turned on or off, from the test onward: 

do % this is compiled 

while (false) 

{...; % this is skipped 

} % no jump back 

for what it may be worth. A useful form, however, is the ‘infinite’ loop we saw earlier. 
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18. Syntax, character set 

By now you probably have a pretty good feel for the syntax of the language, without even 
being told the nuts and bolts rules. Blanks, tabs, and newlines are largely ignored (but a string 
may not contain a newline — you must use the escape sequence \n). It helps if you are careful 
to indent to show nesting depth, as the examples have showed. And you may write names of 
any length, for clarity, but only the first eight characters are meaningful. 

Operators may not be run together as in most languages: 
x —>—Ispl; % is illegal 
x —> —Ispl; % is required 

It is possible to type LIL programs quite readably using just the basic ascii 64-character set. 
Braces can be replaced with ordinary parentheses (since LIL doesn’t distinguish them anyway), * 
for I, and ! for *. The compiler does distinguish between upper and lower case letters, so beware 
of trying to alter a mixed-case program from a one-case terminal. That way madness lies. 

19. Compiling 

To use the LIL compiler, type the UNIX command line 
1c args 

where args are source files if they end in .1, flags if they begin with —, and object files otherwise. 
Flags and loading conventions are just like those for the C compiler (where appropriate see ccfl) 
in the UNIX manual). Some important flags are: 

—p Include ‘profile’ library, which will automatically produce a histogram of code usage 

when program is run (see profll) in UNIX manual). 

—P Preprocess only, writing expanded source on file with .1 changed to .i. 

—c Compile only, writing standard UNIX object code on file with .1 changed to .o. 

Much of this baggage is for use with C world programs. If you are coding entirely in LIL, 
say for another PDP-11, you will probably just use 
/usr/pjp/pdp/lc —c files 

where each of the files is a source file whose name ends in .1. In this case, each file is compiled 
in turn and its object code written on a file with the same name, only ending in .o. You can 
then bind object decks together with the loader (see ld(I)), to make a standard UNIX executable 
file. This is certainly the best format to work from, since it lets you mix in existing assembly 
language code, and anything else that UNIX provides (even Fortran). 

20. Unary operators 

As promised, we now present a complete list of all the operators that LIL recognizes, start¬ 
ing with the unary (one argument) operators. These have the same meaning at compile-time as 
at run-time, for none of them generate code. Instead, they modify references to symbols or ex¬ 
pressions, by changing their type, value, or other attributes. For convenience, we define the fol¬ 
lowing symbols: 

I non-byte memory reference. 

m, n absolute (i.e. unrelocatable) immediate. 

r register. 

v, w any non-byte address type. 
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x, y any PDP-11 address type. 

The unary operators are: 

x++ converts type register indirect to type autoincrement. 

—x converts type register indirect to type autodecrement. 

=n compiles the absolute immediate or string n into the data section, appending a zero word 
if the last byte is not null. Result is type immediate, value is address of start of the allo¬ 
cated string. 

&x converts x to type immediate, removes byte attribute, 
mem x converts x to type memory, removes byte attribute. 

reg x converts x to type register, removes byte attribute, x must be defined absolute with 0 
<= value <= 7. 
word x removes byte attribute, 
byte x attaches byte attribute. 

—n negates absolute immediate n. 

'n complements absolute immediate n. 

rts r converts type register r to an immediate with value of rts r instruction, 
jsr r converts register r to an immediate with value of jsr r,*$ instruction. (Should be fol¬ 
lowed by address of routine.) 

sys n converts immediate n to an immediate with value of sys n instruction, 
sizeof x converts x reference to an absolute immediate with value equal to size attribute of x 
reference. 

Ill n shifts absolute immediate n left 8. 

21. Compile-time binary operators 

x>=y, x>y, x=y, x<y, x<=y, x*=y result is a condition with value true if the relation ob¬ 
tains, otherwise false. Both x and y must have the same relocation bias. Comparisons 
are all on signed quantities. 

x=y, y—>x x is defined to have the same type, value, and other attributes as the y reference, x 
must be previously undefined (unless it is dot or one of the four location counters). 
x=»— n x is defined as the negative of the absolute symbol n. 
x='n x is defined as the complement of the absolute symbol n. 

The type of x is left unchanged in the following: 
x+y result is the sum of x and y. At most one of the two may be relocatable or undefined, 

x—y result is the difference of x and y. If y is relocatable or undefined, it must have the same 

bias as x. 

n’m result is n equivalence m (not exclusive or). 

n”m result is n exclusive or m (not equivalence), 

nlm inclusive or. 
n&m and. 

n&'m same as n&Om). 

n**m n is shifted left (if m>0) or right (if m<0) by I ml. 
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n*m multiplication, 
n/m division. 

nlllm same as (n&0377) I (m**8) [packs bytes], 
x sizeof n result is x with size n. 

22. Run-time binary operators 

These are the creatures that actually generate code. If x and y are not both byte or both 
word, then one of them must be a non-byte register or immediate. The instruction generated 
will then be byte mode if either operand is byte. The following is a metalinguistic description of 
the decisions made by LIL in producing code for each operator. 

x>=y, x>y, x==y, x<y, x<=y, x'=y [compare signedl 
x>>s»y, x>>y, x<<y, x<<—y [compare unsignedl 

if (y==0 && cc set on x) do nothing 

else if (y==0) tst(b) x 

else if (x==0) tst(b) y 

else cmp(b) x,y 

x-y, 

y—>x if (y==0) clKb) x 

else if (x=={carry oflow zero minus) && y=={true false)) 
setx or secx 

else if (y==minus) sxt x 

else mov(b) y,x 


x=—y if (y==0) clr(b) x 

else if (x——y) neg(b) x 


x='y if (x——[carry oflow 

else if (x==y) 

x+y if (y—-1) 

else if (y==carry) 
else 


ro minus) && y=={true false}) 
secx or setx 
com(b) x 

inc(b) x 
adc(b) x 
add y,x 


x-y 

xly 

x&n 

x&'y 

x”r 

r*y 

r/y 


dec(b) x 
sbc(b) x 
sub y,x 

bisCb) y,x 

bic(b) $!n,x 

bic(b) y,x 

xor r,x 

mul y,r 

div y,r 


if (y==l) 

else if (y==carry) 
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x?y if (y=**0) 

tsKb) x 

else 

cmp(b) x,y 

x?&y 

biKb) x,y 

xOn if (n»*l) 

rol(b) x 

else if (n= —1) 

roKb) x 

x<*>n if (n==8) 

swab x 

else if (x odd reg) 

ashc n,x 

x**n if (n==l) 

asl(b) x 

else if (n== —1) 

asrib) x 

else if (x a register) 

r***n ashc n,r 

ash n,x 


Both the sizeof and III operators are also defined at run-time and have the same effect as at 
compile-time. They do not directly cause code to be generated. 

The compiler is aware of what is happening to the condition code most of the time. It 
knows, for instance, that swab and function calls do not leave the code in a state implied by the 
language, and so will generate a tst if the result of either is to be used in a test. The PDP-11 is 
somewhat whimsical about the setting of the carry bit, however, and LIL makes no real attempt 
to second guess the machine. (It makes a difference whether you add 1 or 2 to something, for 
instance.) It is always a good idea to be very careful when testing for special conditions. 

23. Machine instructions 

It is best to learn to think in terms of LIL statements, instead of individual machine instruc¬ 
tions. But to help you make the conversion, here is a list of the PDP-11 orders and how to gen¬ 
erate them one at a time. Symbols are the same as in the previous sections. 


adc(b) x 

x + carry; 

add v,w 

w + v; 

ash r,n 

r ** n; 

ashc r,n 

r *** n; 

asl(b) x 

x ** 1; 

asifb) x 

x ** -1; 

bcc 1 

if ('carry) goto 1; 

bcs 1 

if (carry) goto 1; 

beq 1 

if ( == ) goto 1; 

bge 1 

if ( >» ) goto 1; 

bgt 1 

if ( > ) goto 1; 

bhi 1 

if ( > > ) goto 1; 

bhis 1 

if ( >>“ ) goto 1; 

bic(b) x,y 

y x; 

bis(b) x,y 

y 1 x; 

biKb) x,y 

y ?& x; 

ble 1 

if ( <- ) goto 1; 

bio 1 

if ( < < ) goto 1; 

bios 1 

if ( < <- ) goto 1; 

bit 1 

if ( < ) goto 1; 

bmi 1 

if (minus) goto 1; 

bne 1 

if ( ) goto 1; 











bpl 1 

if ('minus) goto 1; 

bpt 

3; 

br 1 

goto 1; 

bvc 1 

if ('oflow) goto 1; 

bvs 1 

if (oflow) goto 1; 

ccc 

0257; 

clc 

carry = false; 

cin 

minus = false; 

cIKb) x 

x = 0; 

civ 

oflow * false; 

clz 

zero = false; 

cmp(b) x,y 

x ? y; 

com(b) x 

x =~ x; 

dec(b) x 

x - 1; 

div w,r 

w / r; 

emt n 

"014000 + n”; 

halt 

0; 

inc(b) x 

x + 1; 

iot 

4; 

jmp w 

goto w; 

jsr pc,w 

wO; 

jsr r,*$l 

jsr r; 1; 

jsr r,w 


mark n 

”06400 + n"; 

mfpi w 


mov(b) x,y 

y = x; 

mtpi w 


mul w,r 

r * w; 

neg(b) x 

x *— x; 

reset 

5; 

rol(b) x 

x <> 1; 

rortb) x 

x <> -1; 

rti 

2; 

rts r 

rts r; 

rtt 

6; 

sbc(b) x 

x — carry; 

see 

0277; 

sec 

carry = true; 

sen 

minus = true; 

sev 

oflow = true; 

sez 

zero =* true; 

sob 1 


sub v,w 

w — v; 

swab w 

w <*> 8; 

sxt w 

w =* minus; 

trap n 

sys n; 

tst(b) x 

x ? 0; 

wait 

1; 

xor r,w 

w “ r; 


% note inversion 


% not provided 
% not provided 
% not provided 


% not provided 
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24. Keywords and predefined variables 

Keywords are reserved words in LIL, which means you cannot have local variables with the 
same names as keywords (unless you turn off their special meaning with the escape \, as we ex¬ 
plained earlier). Here is a complete list of the keywords: 


break 

goto 

rts 

byte 

if 

sizeof 

continue 

jsr 

sys 

do 

local 

while 

else 

mem 

word 

extern 

Predefined 

reg 

variables, on the other hand, may be superseded by new local definitions. They 

are merely provided for convenience or to make available special services which you may not 

need: 

false 

r3 

.abs 

minus 

r4 

.bss 

oflow 

r5 

.data 

pc 

sp 

.temp 

rO 

true 

.text 

rl 

zero 

carry 

r2 



25. A final note 

Performing compile-time arithmetic with the current location counter, dot or V, is generally 
unsafe, since it is difficult to anticipate how many words will be needed to represent a given se¬ 
quence of instructions. LIL may actually evaluate the same expression differently in different 
places as the compiler learns more about what instructions may be made shorter. Since dot is 
known to be set at the beginning of each statement, there are occasions when it is both safe and 
useful to write compile-time expressions involving dot. One important situation is the indexed 
branch used to implement an efficient ‘case switch’: 


if (MIN <= rO && rO <= MAX) 
switch f 



switch { 

goto + 4 - MIN"|rO|; 

"local caseO, easel, case2, ..."; 
caseO; easel; case2; ...; 

caseO{ 


easel{ 


case2{ 

...;} 

•••» .z - 

} % end of switch 

else 

% default action 


Here MIN and MAX are manifest constants specifying the valid limits of the branch index 
(which of course must be even). The group labelled switch permits the case labels to be kept lo¬ 
cal to this particular switch, so they can be used repeatedly. Default actions, if any, are provided 
by the else clause. 
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